The factors behind people’s happiness change constantly over time. What makes people happy in this age? Let’s find out by analyzing HappyDB, a collection of about 100 thousand happy moments contributed by volunteers.
Part One: Word analysis
To get a general idea of the reasons for happiness, I draw a picture of word frequencies. In the following picture, we can see that some words, like “friend”, “work” and “time”, are related to happiness to some degree. However, this analysis is restricted to single words, and looking at units beyond single words might give us more ideas about people’s happiness.
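
The word-frequency picture boils down to counting how often each word occurs across all happy moments. Below is a minimal Python sketch of that counting step. It assumes a few made-up sentences in place of the real HappyDB text and a tiny hypothetical stop-word list; neither comes from the original analysis.

```python
from collections import Counter

# Hypothetical sample moments; the real analysis uses the HappyDB text.
SAMPLE_MOMENTS = [
    "I had dinner with a friend after work.",
    "My friend and I spent time at the park.",
    "I finished my work on time.",
]

# Tiny illustrative stop-word list (an assumption, not the original one).
STOPWORDS = frozenset(
    {"i", "a", "the", "my", "and", "with", "at", "on", "had", "after"}
)


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count how often each non-stop-word appears across all texts."""
    counts = Counter()
    for text in texts:
        for token in text.lower().split():
            # Drop punctuation attached to the start or end of a token.
            word = token.strip(".,!?;:\"'")
            if word and word not in stopwords:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    # Print the three most frequent words with their counts.
    for word, n in word_frequencies(SAMPLE_MOMENTS).most_common(3):
        print(word, n)
```

A wordcloud then simply scales each word by its count, so the largest words are the most frequent ones.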

Part Two: Topic modeling
To get more information beyond single words, I choose LDA (Latent Dirichlet Allocation) to analyse the data. Before applying the model, I lemmatize the words, which reduces each word to its base form. I then train the LDA model on the data; the interactive result is at the following link: http://localhost:8888/notebooks/Documents/applied_data_science/Project1-RNotebook/lib/LDA-Copy3.ipynb
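
To make the idea of LDA concrete, here is a toy collapsed Gibbs sampler in plain Python. It is only a sketch of the general technique, not the code used in the linked notebook. The documents, the function names `lda_gibbs` and `top_words`, and the hyperparameter values are all assumptions chosen for illustration.

```python
import random

# Hypothetical lemmatized documents; the real input is the HappyDB moments.
DOCS = [
    ["watch", "movie", "show"],
    ["watch", "show", "tv"],
    ["eat", "dinner", "pizza"],
    ["dinner", "pizza", "chicken"],
]


def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab):
    doc_topic[d][k] counts tokens of document d assigned to topic k, and
    topic_word[k][v] counts assignments of vocabulary word v to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = index[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most frequently assigned words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


if __name__ == "__main__":
    doc_topic, topic_word, vocab = lda_gibbs(DOCS, n_topics=2)
    print(top_words(topic_word, vocab))
```

The top words of each topic are what the topic descriptions in this part are read from. On a real corpus, a library implementation would be the practical choice.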
In the visualization of the LDA model, the size of a bubble represents the frequency of the topic, and the distances between bubbles stand for the similarity of topics. Topic 13 in the screenshots is mainly about watching TV shows or movies, which is related to entertainment. It is also reasonable in real life that relaxing activities bring happiness to people.
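
The text does not state how the similarity between topics is measured. One common choice for comparing two topics is the Jensen-Shannon divergence between their word distributions. The sketch below assumes that measure and uses made-up probabilities over a hypothetical four-word vocabulary.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, 0 for identical distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical topic-word distributions over the vocabulary
# ["watch", "movie", "dinner", "pizza"]; the values are illustrative only.
ENTERTAINMENT = [0.5, 0.4, 0.05, 0.05]
FOOD = [0.05, 0.05, 0.5, 0.4]
LEISURE = [0.45, 0.35, 0.1, 0.1]

if __name__ == "__main__":
    print("entertainment vs leisure:", js_divergence(ENTERTAINMENT, LEISURE))
    print("entertainment vs food:   ", js_divergence(ENTERTAINMENT, FOOD))
```

Under this measure, topics that share their most probable words get a small divergence and would be drawn close together, while topics with little vocabulary in common end up far apart.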
Topic 7 is about enjoying time with family. What is interesting here is that “read” and “book” also appear in this topic, which suggests a hidden relationship between spending time with family and reading books.
Topic 2 is more related to finishing work and going back home.
In the next topic, “time” and “first” are the two top words, so it is probably closely related to trying new things.
Topic 4 is a typical food topic. It shows that a delicious dinner or lunch can also make people happy. The picture shows that “dinner” appears more frequently than “lunch” and “breakfast”, which suggests that people pay more attention to dinner. The plot also reveals people’s favourite foods, such as ice cream, pizza and chicken.
The source of happiness in topic 9 is probably getting a job, judging by words such as “job”, “get” and “interview”.
Part Three: Analysis with additional information
To gain a deeper understanding of people’s happiness, I add other information included in the data files to the analysis, such as topic dictionaries and information about the volunteers.
The following picture is a wordcloud for the ‘family’ topic. As we can see in the picture, people talk more about daughters and sons than about husbands and wives. What’s more, “mom”, “dad”, “brother” and “sister” appear less frequently. It is an interesting phenomenon: people seem to get more happiness from their children than from their parents.

In the wordcloud for entertainment, the most salient word is “watch”, followed by “show” and “movies”. These words match the result of the LDA model for topic 13. Besides that, “play video games” and “read book” are also prominent in this wordcloud.

From the following picture about exercise, we can conclude that running and walking play a significant role in bringing happiness to people.

As for the topic about pets, unsurprisingly, “cat” and “dog” appear frequently.

As for school, people seem to get happiness from passing exams or doing well in their studies.

After that, I draw a bar plot of the different topics. According to the plot, the three most popular topics are “family”, “food” and “people”, which suggests that it is easier to find happiness when doing things related to these three topics.
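
The bar plot is essentially a tally of how many entries carry each topic label. Below is a minimal Python sketch of that tally. It assumes one topic label per entry and uses made-up labels rather than the real data; the function name `topic_shares` is hypothetical.

```python
from collections import Counter

# Hypothetical topic labels, one per entry; the real labels come from
# the topic dictionaries shipped with the data.
SAMPLE_LABELS = [
    "family", "food", "family", "people", "pets",
    "food", "family", "school", "people", "exercise",
]


def topic_shares(labels):
    """Return (label, count, share) tuples, most frequent label first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, n / total) for label, n in counts.most_common()]


if __name__ == "__main__":
    # Each printed row corresponds to one bar of the plot.
    for label, n, share in topic_shares(SAMPLE_LABELS):
        print(f"{label:10s} {n:3d} {share:.0%}")
```

Reporting the share alongside the raw count makes it easier to compare topics when the number of labeled entries changes.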

Part Four: Conclusion
By analysing HappyDB, we now have a clearer idea of why people are happy:
- Watching movies/TV shows as entertainment
- Having fun and spending time with family
- Finishing everyday work and going back home
- Doing things for the first time and learning new things
- Having delicious food at dinner or lunch
- Passing an interview and getting a new job